智能论文笔记

On the Computational Complexity of Multi-Agent Pathfinding on Directed Graphs

Bernhard Nebel

分类：人工智能

2019-11-11

多年来，确定有向图的多代理途径的计算复杂性一直是一个空旷的问题。对于无方向的图，可以在多项式时间内确定可溶性，如八十年代所示。此外，最近已经证明，在多项式时间内可以解决有向图的特殊情况。在本文中，我们表明问题在一般情况下是NP-HARD。另外，一些上限得到了证明。

translated by 谷歌翻译

Zero-Shot Object Segmentation through Concept Distillation from Generative Image Foundation Models

Mischa Dombrowski , Hadrien Reynaud , Matthew Baugh , Bernhard Kainz

分类：计算机视觉

2022-12-29

Curating datasets for object segmentation is a difficult task. With the advent of large-scale pre-trained generative models, conditional image generation has been given a significant boost in result quality and ease of use. In this paper, we present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions, without requiring segmentation labels. We leverage and explore pre-trained latent diffusion models, to automatically generate weak segmentation masks for concepts and objects. The masks are then used to fine-tune the diffusion model on an inpainting task, which enables fine-grained removal of the object, while at the same time providing a synthetic foreground and background dataset. We demonstrate that using this method beats previous methods in both discriminative and generative performance and closes the gap with fully supervised training while requiring no pixel-wise object labels. We show results on the task of segmenting four different objects (humans, dogs, cars, birds).

translated by 谷歌翻译

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

Justus Mattern , Zhijing Jin , Mrinmaya Sachan , Rada Mihalcea , Bernhard Schölkopf

分类：自然语言处理 | 机器学习

2022-12-20

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics. These findings prompted large efforts aiming to understand and measure such effects, with the goal of providing benchmarks that can guide the development of techniques mitigating these stereotypical associations. However, as recent research has pointed out, the current benchmarks lack a robust experimental setup, consequently hindering the inference of meaningful conclusions from their evaluation metrics. In this paper, we extend these arguments and demonstrate that existing techniques and benchmarks aiming to measure stereotypes tend to be inaccurate and consist of a high degree of experimental noise that severely limits the knowledge we can gain from benchmarking language models based on them. Accordingly, we propose a new framework for robustly measuring and quantifying biases exhibited by generative language models. Finally, we use this framework to investigate GPT-3's occupational gender bias and propose prompting techniques for mitigating these biases without the need for fine-tuning.

translated by 谷歌翻译

Walking Noise: Understanding Implications of Noisy Computations on Classification Tasks

Hendrik Borras , Bernhard Klein , Holger Fröning

分类：机器学习

2022-12-20

Machine learning methods like neural networks are extremely successful and popular in a variety of applications, however, they come at substantial computational costs, accompanied by high energy demands. In contrast, hardware capabilities are limited and there is evidence that technology scaling is stuttering, therefore, new approaches to meet the performance demands of increasingly complex model architectures are required. As an unsafe optimization, noisy computations are more energy efficient, and given a fixed power budget also more time efficient. However, any kind of unsafe optimization requires counter measures to ensure functionally correct results. This work considers noisy computations in an abstract form, and gears to understand the implications of such noise on the accuracy of neural-network-based classifiers as an exemplary workload. We propose a methodology called "Walking Noise" that allows to assess the robustness of different layers of deep architectures by means of a so-called "midpoint noise level" metric. We then investigate the implications of additive and multiplicative noise for different classification tasks and model architectures, with and without batch normalization. While noisy training significantly increases robustness for both noise types, we observe a clear trend to increase weights and thus increase the signal-to-noise ratio for additive noise injection. For the multiplicative case, we find that some networks, with suitably simple tasks, automatically learn an internal binary representation, hence becoming extremely robust. Overall this work proposes a method to measure the layer-specific robustness and shares first insights on how networks learn to compensate injected noise, and thus, contributes to understand robustness against noisy computations.

translated by 谷歌翻译

Steel Phase Kinetics Modeling using Symbolic Regression

David Piringer , Bernhard Bloder , Gabriel Kronberger

分类：机器学习 | 神经与进化计算

2022-12-19

We describe an approach for empirical modeling of steel phase kinetics based on symbolic regression and genetic programming. The algorithm takes processed data gathered from dilatometer measurements and produces a system of differential equations that models the phase kinetics. Our initial results demonstrate that the proposed approach allows to identify compact differential equations that fit the data. The model predicts ferrite, pearlite and bainite formation for a single steel type. Martensite is not yet included in the model. Future work shall incorporate martensite and generalize to multiple steel types with different chemical compositions.

translated by 谷歌翻译

Bayesian posterior approximation with stochastic ensembles

Oleksandr Balabanov , Bernhard Mehlig , Hampus Linander

分类：机器学习 | 计算机视觉 | (统计)机器学习

2022-12-15

We introduce ensembles of stochastic neural networks to approximate the Bayesian posterior, combining stochastic methods such as dropout with deep ensembles. The stochastic ensembles are formulated as families of distributions and trained to approximate the Bayesian posterior with variational inference. We implement stochastic ensembles based on Monte Carlo dropout, DropConnect and a novel non-parametric version of dropout and evaluate them on a toy problem and CIFAR image classification. For CIFAR, the stochastic ensembles are quantitatively compared to published Hamiltonian Monte Carlo results for a ResNet-20 architecture. We also test the quality of the posteriors directly against Hamiltonian Monte Carlo simulations in a simplified toy model. Our results show that in a number of settings, stochastic ensembles provide more accurate posterior estimates than regular deep ensembles.

translated by 谷歌翻译

Towards Hardware-Specific Automatic Compression of Neural Networks

Torben Krieger , Bernhard Klein , Holger Fröning

分类：机器学习 | (统计)机器学习

2022-12-15

Compressing neural network architectures is important to allow the deployment of models to embedded or mobile devices, and pruning and quantization are the major approaches to compress neural networks nowadays. Both methods benefit when compression parameters are selected specifically for each layer. Finding good combinations of compression parameters, so-called compression policies, is hard as the problem spans an exponentially large search space. Effective compression policies consider the influence of the specific hardware architecture on the used compression methods. We propose an algorithmic framework called Galen to search such policies using reinforcement learning utilizing pruning and quantization, thus providing automatic compression for neural networks. Contrary to other approaches we use inference latency measured on the target hardware device as an optimization goal. With that, the framework supports the compression of models specific to a given hardware target. We validate our approach using three different reinforcement learning agents for pruning, quantization and joint pruning and quantization. Besides proving the functionality of our approach we were able to compress a ResNet18 for CIFAR-10, on an embedded ARM processor, to 20% of the original inference latency without significant loss of accuracy. Moreover, we can demonstrate that a joint search and compression using pruning and quantization is superior to an individual search for policies using a single compression method.

translated by 谷歌翻译

Evaluating vaccine allocation strategies using simulation-assisted causal modelling

Armin Kekić , Jonas Dehning , Luigi Gresele , Julius von Kügelgen , Viola Priesemann , Bernhard Schölkopf

分类：人工智能

2022-12-14

Early on during a pandemic, vaccine availability is limited, requiring prioritisation of different population groups. Evaluating vaccine allocation is therefore a crucial element of pandemics response. In the present work, we develop a model to retrospectively evaluate age-dependent counterfactual vaccine allocation strategies against the COVID-19 pandemic. To estimate the effect of allocation on the expected severe-case incidence, we employ a simulation-assisted causal modelling approach which combines a compartmental infection-dynamics simulation, a coarse-grained, data-driven causal model and literature estimates for immunity waning. We compare Israel's implemented vaccine allocation strategy in 2021 to counterfactual strategies such as no prioritisation, prioritisation of younger age groups or a strict risk-ranked approach; we find that Israel's implemented strategy was indeed highly effective. We also study the marginal impact of increasing vaccine uptake for a given age group and find that increasing vaccinations in the elderly is most effective at preventing severe cases, whereas additional vaccinations for middle-aged groups reduce infections most effectively. Due to its modular structure, our model can easily be adapted to study future pandemics. We demonstrate this flexibility by investigating vaccine allocation strategies for a pandemic with characteristics of the Spanish Flu. Our approach thus helps evaluate vaccination strategies under the complex interplay of core epidemic factors, including age-dependent risk profiles, immunity waning, vaccine availability and spreading rates.

translated by 谷歌翻译

HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics

Artur Grigorev , Bernhard Thomaszewski , Michael J. Black , Otmar Hilliges

分类：计算机视觉

2022-12-14

We propose a method that leverages graph neural networks, multi-level message passing, and unsupervised training to enable real-time prediction of realistic clothing dynamics. Whereas existing methods based on linear blend skinning must be trained for specific garments, our method is agnostic to body shape and applies to tight-fitting garments as well as loose, free-flowing clothing. Our method furthermore handles changes in topology (e.g., garments with buttons or zippers) and material properties at inference time. As one key contribution, we propose a hierarchical message-passing scheme that efficiently propagates stiff stretching modes while preserving local detail. We empirically show that our method outperforms strong baselines quantitatively and that its results are perceived as more realistic than state-of-the-art methods.

translated by 谷歌翻译

On the Relationship Between Explanation and Prediction: A Causal View

Amir-Hossein Karimi , Krikamol Muandet , Simon Kornblith , Bernhard Schölkopf , Been Kim

分类：机器学习 | (统计)机器学习

2022-12-13

Explainability has become a central requirement for the development, deployment, and adoption of machine learning (ML) models and we are yet to understand what explanation methods can and cannot do. Several factors such as data, model prediction, hyperparameters used in training the model, and random initialization can all influence downstream explanations. While previous work empirically hinted that explanations (E) may have little relationship with the prediction (Y), there is a lack of conclusive study to quantify this relationship. Our work borrows tools from causal inference to systematically assay this relationship. More specifically, we measure the relationship between E and Y by measuring the treatment effect when intervening on their causal ancestors (hyperparameters) (inputs to generate saliency-based Es or Ys). We discover that Y's relative direct influence on E follows an odd pattern; the influence is higher in the lowest-performing models than in mid-performing models, and it then decreases in the top-performing models. We believe our work is a promising first step towards providing better guidance for practitioners who can make more informed decisions in utilizing these explanations by knowing what factors are at play and how they relate to their end task.

translated by 谷歌翻译